Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Praharsh Churi, Yadnesh Patil, Sahil Patil, Shivendra Patil, Prof. Anand Magar
DOI Link: https://doi.org/10.22214/ijraset.2023.56981
A crucial component of human expression, music has the astonishing power to evoke a wide range of emotions. In this study, we describe a novel method for creating music that integrates Indian classical music, computer vision, and emotion analysis. In our project, named "Emotion-Based Music Generation," we use MediaPipe for facial expression recognition, Keras for music generation, OpenCV for real-time webcam access, and Streamlit-WebRTC for web application development. Based on the Nava Rasas of Indian classical music, the technique isolates nine fundamental emotions and generates musical compositions accordingly. Together, these technologies form an enjoyable and interactive system that allows users to explore the emotional spectrum of music.
I. INTRODUCTION
Music is a universal language that cuts across linguistic, cultural, and geographic barriers. It has a special ability to arouse feelings, bring back memories, and forge strong bonds with the human soul [17]. Music has the power to move our emotions and lift our spirits, whether it is the ecstasy of a joyous symphony, the serenity of a lullaby, or the melancholy of a soulful ballad. Our relationship with this ancient form of expression is constantly being redefined by the blending of technology and artistry in the world of music [3].
Our initiative, "Emotion-Based Music Generation," advances this area by harnessing contemporary technology to produce music that is not only aural but also emotive. Drawing inspiration from the rich tapestry of emotions represented in the Nava Rasas of Indian classical music (Love, Joy, Surprise, Sadness, Anger, Disgust, Fear, Peace, and Courage) [19], we set out to make full use of the expressive power of technology. Our project aims to analyse the emotional landscape of people in real time and transform their ever-changing expressions into mellow and resonant musical compositions by integrating computer vision, deep learning, and real-time webcam access [1].
The relevance of this project rests in its potential to fundamentally alter how we interact with music in the age of digital interconnection and creative inquiry. The system provides a dynamic and immersive platform for users to interact with their emotions through music [2]. It is supported by MediaPipe's facial expression recognition, Keras's music composition abilities, OpenCV's real-time camera access, and Streamlit-WebRTC's user-friendly interface. Our Emotion-Based Music Generation method opens up new avenues for creative expression, whether one is looking for a therapeutic release, a creative outlet, or simply a fresh way to listen to music [3].
This paper explores the details of our process, from the real-time analysis of facial expressions to the carefully constructed musical compositions that represent emotions [2]. It also discusses the empirical findings of user testing, which shed light on the effectiveness and usability of the system. Finally, we look at possible directions for future research, emphasising the potential for enriching the emotional palette, supporting user personalization, and enabling collaborative music creation. The project's ultimate goal is to redefine how we experience music while also enabling people to examine the complex connection between their emotions and the melodies that enrich their lives [4].
II. LITERATURE SURVEY
There has been very little, if any, research on mood detection based on North Indian Classical Music (NICM) [12]. After consulting the literature on music mood classification for Western music and the features used in it, and after studying emotional models such as Thayer's model [13], Russell's circumplex model [5], and the Raga-Nava Rasa theory [10], we were able to determine the connection between Indian classical music features and mood. The first two steps in NICM mood mapping are audio feature extraction and multi-label classification into moods [13]. The primary emotional influencing elements of Indian classical music are melody and rhythm, and the class labels' mood categories are drawn from the vocabulary of the Nava Rasas.
A. Emotion Recognition Using Computer Vision
The application of computer vision methods to emotion recognition is a growing area of study. The article "Emotion Recognition in the Wild" by Zafeiriou et al. (2017) is a landmark in this field, introducing a deep neural network-based method for identifying emotions from facial expressions in real-world situations. Work of this kind laid the foundation for frameworks such as MediaPipe, which offers pre-trained models for facial analysis and enables the "Emotion-Based Music Generation" project to analyze emotions in real time [2].
B. Music Generation using Machine Learning
With the emergence of machine learning, music generation has seen major breakthroughs. A notable contribution is Google's "Magenta" project, which explores the relationship between music and machine learning. Magenta provides a variety of models and tools for composing music, including deep learning-based methods such as "Magenta Studio" (Simon et al., 2017), which use LSTM networks for music composition. This work served as the foundation for the Keras-based music generation component of the project.
C. Real-Time Computer Vision and Web Technologies
The "Emotion-Based Music Generation" project benefits from real-time computer vision and web technologies. OpenCV, an open-source computer vision library, is frequently used for real-time video analysis. Its integration with web technologies such as Streamlit-WebRTC, as used in this project, is in line with the broader movement to enable real-time and interactive experiences within web applications [3].
D. Emotional Music Generation and Nava Rasas
In "Emotional Music Generation: The Next Step in Music Evolution" by Saari et al. (2017), which offers a framework for producing music based on emotional states, the topic of music generation has been examined in relation to emotions. Additionally, in keeping with the cultural setting of the "Emotion-Based Music Generation" project, the "Nava Rasas" framework from Indian classical music offers a wealth of inspiration for relating emotions to musical compositions [10].
E. User Experience and Interactive Applications
The "Emotion-Based Music Generation" project builds on previous work in the area of user experience and interactive applications, including "Real-Time Interactive Applications with WebRTC" by Rescorla (2012), which demonstrates the potential of WebRTC in creating real-time and interactive web applications, thereby boosting user engagement [3].
III. METHODOLOGY
The "Emotion-Based Music Generation" project uses a complex and diverse methodology that smoothly merges music composition, real-time webcam access, computer vision, machine learning, and web application development. Each element is essential in building a comprehensive system that can identify emotions in the moment and translate them into melodic musical creations [13]. The main procedures and steps of the project are more thoroughly described in this extended approach.
A. Emotion Detection with MediaPipe
MediaPipe, a Google-developed open-source computer vision framework, serves as the methodology's cornerstone. The initial step is real-time facial emotion identification using MediaPipe's pre-trained deep neural network models. These models detect facial landmarks and then infer emotions from the positions of those landmarks on the face. The approach classifies a variety of emotions, namely Love, Joy, Surprise, Sadness, Anger, Disgust, Fear, Peace, and Courage, in accordance with the Nava Rasas of Indian classical music [19]. Emotion recognition happens in real time while the webcam records the user's facial expressions [8].
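As a concrete illustration, the following Python sketch shows how MediaPipe FaceMesh landmarks can be extracted from a frame and passed to a classifier that outputs one of the nine Nava Rasa labels. The Keras model file (emotion_model.h5), its input format, and the label ordering are assumptions introduced for illustration, not artifacts released with this project.

```python
# Minimal sketch: extract face landmarks with MediaPipe and classify them into
# the nine Nava Rasa emotions. "emotion_model.h5" and the label order are
# hypothetical placeholders, not the authors' released artifacts.
import cv2
import numpy as np
import mediapipe as mp
from tensorflow import keras

EMOTIONS = ["Love", "Joy", "Surprise", "Sadness", "Anger",
            "Disgust", "Fear", "Peace", "Courage"]

face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1, refine_landmarks=True)
emotion_model = keras.models.load_model("emotion_model.h5")  # hypothetical classifier

def detect_emotion(frame_bgr):
    """Return one of the nine emotion labels for the face in a BGR frame, if any."""
    results = face_mesh.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None
    # Flatten the (x, y, z) coordinates of all landmarks into a feature vector.
    landmarks = results.multi_face_landmarks[0].landmark
    features = np.array([[p.x, p.y, p.z] for p in landmarks]).flatten()[None, :]
    probabilities = emotion_model.predict(features, verbose=0)[0]
    return EMOTIONS[int(np.argmax(probabilities))]
```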
B. Emotion-to-Music Mapping
Once emotions have been identified, the next stage is to associate them with particular musical elements. The approach makes use of well-established concepts from Indian classical music theory to achieve this. The Nava Rasas give this mapping a culturally appropriate foundation and enable an emotionally resonant link between the recognised emotion and the matching musical traits [10]. For instance, Joy might be represented by uplifting melodies, pleasing harmonies, and a brisk tempo, whereas Sadness might be represented by slower, melancholy tunes. Iterative testing and user input are used to refine this mapping [8].
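A minimal sketch of how such a mapping might be encoded is shown below. The specific tempi, scales, and dynamics are illustrative assumptions; the project's actual mapping is refined through iterative testing and user feedback as described above.

```python
# Illustrative emotion-to-music mapping. The tempi, scales, and dynamics below
# are assumptions shown for structure only, not the paper's exact mapping.
EMOTION_TO_MUSIC = {
    "Joy":     {"tempo_bpm": 140, "scale": "major",            "dynamics": "bright"},
    "Sadness": {"tempo_bpm": 60,  "scale": "natural_minor",    "dynamics": "soft"},
    "Peace":   {"tempo_bpm": 72,  "scale": "major_pentatonic", "dynamics": "gentle"},
    "Anger":   {"tempo_bpm": 150, "scale": "phrygian",         "dynamics": "accented"},
    # ... remaining rasas (Love, Surprise, Disgust, Fear, Courage) follow the same shape
}

def music_parameters(emotion):
    """Look up the musical traits for a detected emotion, defaulting to Peace."""
    return EMOTION_TO_MUSIC.get(emotion, EMOTION_TO_MUSIC["Peace"])
```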
C. Music Generation with Keras
The system uses the Keras library to bring the mapped emotions to life as music. The Keras-based model is trained on a large dataset of previously composed musical works, each of which has been emotionally labelled. The model learns the association between emotions and musical structures from this training set [10]. The system uses recurrent neural networks (RNNs) with long short-term memory (LSTM) layers to create melodies, harmonies, and rhythms that correspond to the recognised emotions. To achieve a high-quality output, the model goes through ongoing training and refinement [1].
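The following sketch outlines how an emotion-conditioned LSTM sequence model could be set up in Keras. The vocabulary size, sequence length, and the simple strategy of appending a one-hot emotion label to every timestep are assumptions made for illustration, not the exact architecture used in the project.

```python
# Minimal sketch of an emotion-conditioned LSTM note model in Keras.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 128   # e.g. MIDI pitch numbers (assumed note vocabulary)
SEQ_LEN = 32       # notes of context fed to the model (assumed)
NUM_EMOTIONS = 9   # one-hot Nava Rasa label appended to every timestep

model = keras.Sequential([
    layers.LSTM(256, return_sequences=True,
                input_shape=(SEQ_LEN, VOCAB_SIZE + NUM_EMOTIONS)),
    layers.LSTM(256),
    layers.Dense(VOCAB_SIZE, activation="softmax"),  # distribution over the next note
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

def generate_notes(seed_sequence, emotion_onehot, length=64):
    """Sample `length` notes autoregressively, feeding the emotion at every step."""
    sequence = seed_sequence.copy()          # shape (SEQ_LEN, VOCAB_SIZE), one-hot notes
    generated = []
    for _ in range(length):
        conditioned = np.concatenate(
            [sequence, np.tile(emotion_onehot, (SEQ_LEN, 1))], axis=-1)[None, :]
        probs = model.predict(conditioned, verbose=0)[0]
        probs = probs / probs.sum()          # renormalise against floating-point drift
        note = int(np.random.choice(VOCAB_SIZE, p=probs))
        generated.append(note)
        one_hot = np.zeros((1, VOCAB_SIZE))
        one_hot[0, note] = 1.0
        sequence = np.vstack([sequence[1:], one_hot])  # slide the context window
    return generated
```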
D. Real-time Webcam Access with OpenCV
The user's webcam video feed is captured using OpenCV, an open-source computer vision library. By continuously reading and processing webcam frames, the OpenCV component ensures the system runs in real time. These frames are fed into the MediaPipe emotion recognition module to enable dynamic, in-the-moment emotion identification [9].
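A minimal capture loop of this kind is sketched below; detect_emotion is the hypothetical MediaPipe-based helper from the earlier sketch.

```python
# Minimal real-time capture loop: read webcam frames with OpenCV and pass each
# one to the (hypothetical) detect_emotion helper sketched above. Press "q" to stop.
import cv2

cap = cv2.VideoCapture(0)  # default webcam
try:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        emotion = detect_emotion(frame)  # MediaPipe-based detector from the earlier sketch
        if emotion is not None:
            cv2.putText(frame, emotion, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow("Emotion-Based Music Generation", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```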
E. Web Application with Streamlit-WebRTC
A web application is built using Streamlit-WebRTC, an extension to the Streamlit framework, to make the system accessible and user-friendly [3]. Users access the application through a web browser and grant it access to their webcam [11]. The web application shows the user's live video feed alongside the generated music, giving the user a smooth and immersive experience. This dual presentation enables real-time interaction with the generated music and increases user engagement [6].
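A minimal front-end sketch using streamlit-webrtc is shown below, assuming a recent library version that supports the video_frame_callback argument; detect_emotion is again the hypothetical helper from the earlier sketch, and audio streaming of the generated music is omitted.

```python
# Minimal sketch of the browser front end with streamlit-webrtc.
import av
import cv2
import streamlit as st
from streamlit_webrtc import webrtc_streamer

st.title("Emotion-Based Music Generation")

def video_frame_callback(frame: av.VideoFrame) -> av.VideoFrame:
    img = frame.to_ndarray(format="bgr24")
    emotion = detect_emotion(img)  # hypothetical per-frame detector from the earlier sketch
    if emotion is not None:
        cv2.putText(img, emotion, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        # In the full system, the detected emotion would also drive the Keras
        # music generator and stream the resulting audio back to the page.
    return av.VideoFrame.from_ndarray(img, format="bgr24")

webrtc_streamer(key="emotion-music", video_frame_callback=video_frame_callback)
```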
F. User Interaction and Feedback Loop
User engagement and feedback are essential to the methodology. The system encourages users to express a range of emotions in order to engage them in real time. Users' emotional responses to the generated music are continuously analyzed to inform the musical composition. This feedback loop keeps the generated music in tune with the user's emotional state, resulting in a dynamic and personalized experience [18].
G. Ethical Considerations
An important focus of the "Emotion-Based Music Generation" project is on ethical issues such as user consent, data privacy, and user well-being. Users are openly informed about webcam access and are given the choice to participate. User data is rigorously protected by privacy safeguards [15]. The system also aims to provide a safe and therapeutic musical experience, and potential emotional triggers are carefully avoided.
IV. MEANING AND APPLICATION OF NAVA RASAS IN INDIAN CLASSICAL MUSIC
For the purpose of creating music based on different emotions, we have taken into consideration the Nava Rasas, the nine emotional states of Indian classical music [10].
These nine sentiments are Love (Shringara), Joy (Hasya), Surprise (Adbhuta), Sadness (Karuna), Anger (Raudra), Disgust (Bibhatsa), Fear (Bhayanaka), Peace (Shanta), and Courage (Veera) [19].
V. RESULTS
A group of users tested the emotion-based music generation system. Participants reported that the generated music was in tune with their emotional states. The application's real-time functionality made for an engaging and participatory experience. User feedback has been instrumental in fine-tuning the emotion-to-music mapping and improving the precision of emotion recognition.
VI. FUTURE SCOPE
The project "Emotion-Based Music Generation" is still active and has promising potential for advancements in the future:
A larger spectrum of emotional states can be recognised by extending the emotion recognition skills, which will enable the creation of more subtle music.
User Customization: Give consumers the option to alter many aspects of the music-generation process, such as the tempo, musical instruments, and musical style.
Implement collaborative music generation so that several users can make real-time contributions to a shared musical creation.
Real-time Music Composition: Create algorithms that enable modifications to real-time music composition based on user input and feelings displayed throughout the encounter.
These new paths will improve the project's capabilities and increase its adaptability to different use cases.
VII. ACKNOWLEDGEMENT
Without the assistance of our mentor, Prof. Anand Magar of the Vishwakarma Institute of Technology, Pune, this project model and article would not have been possible. We were able to complete this prototype system thanks to the active involvement of every teammate. Finally, we would like to express our gratitude to the university and institute for providing this opportunity and for their unwavering support.
VIII. CONCLUSION
The Emotion-Based Music Generation project shows how technology and art can be combined to provide engaging and emotionally impactful experiences. This study opens up new directions for investigating how artificial intelligence and human expressiveness interact. The system can be further enhanced with a larger and more varied dataset for mapping emotions to music and with ongoing improvement of the music generation model. We envisage a wider range of uses in the future, from the production of healing music to entertainment and artistic expression.
REFERENCES
[1] Patrik N. Juslin and Petri Laukka, "Expression, Perception, and Induction of Musical Emotions: A Review and a Questionnaire Study of Everyday Listening," Journal of New Music Research, Vol. 33, No. 3, pp. 217–238, 2004.
[2] V. N. Bhatkande, Hindusthani Sangeet Paddhati, Sangeet Karyalaya, 1920–1979.
[3] https://www.researchgate.net/publication/331344763_Peer_to_Peer_Multimedia_Real-Time_Communication_System_based_on_WebRTC_Technology
[4] Sharangadev, Sangeet Ratnakara, 1210–1247.
[5] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367156/
[6] Monojit Choudhury and Pradipta Ranjan Ray, "Measuring Similarities across Musical Compositions: An Approach Based on the Raag Paradigm," Journal of the ITC-SRA, Vol. 17, 2003.
[7] Anirban Patranabis et al., "Measurement of Emotion Induced by Hindusthani Music: A Human Response and EEG Study," Journal of ITC Sangeet Research Academy (Ninaad), Vol. 26-27, December 2013.
[8] Asmita Chatterji and Dana Ganu, "A Framework for Understanding the Relation Between Music and Emotions," Journal of ITC Sangeet Research Academy, Vol. 26-27, December 2013.
[9] Alicja A. Wieczorkowska, Ashoke Kumar Datta, Ranjan Sengupta, Nityananda Dey, and Bhaswati Mukherjee, "On Search for Emotion in Hindusthani Vocal Music," Advances in Music Information Retrieval, SCI 274, pp. 285–304, Springer-Verlag Berlin Heidelberg, 2010.
[10] https://www.researchgate.net/publication/330466789_Application_of_the_Navarasa_Theory_in_Architecture
[11] Parag Chordia and Alex Rae, "Understanding Emotion in Raag: An Empirical Study of Listener Responses," Computer Music Modeling and Retrieval: Sense of Sounds, pp. 110–124, 2009; Chuan-Yu Chang, Chi-Keng Wu, Chun-Yen Lo, Chi-Jane Wang, and Pau-Choo Chung, "Music Emotion Recognition with Consideration of Personal Preference," IEEE Transactions on Multimedia, 2011.
[12] https://www.researchgate.net/publication/318760264_Analysis_of_Features_for_Mood_Detection_in_North_Indian_Classical_Music-A_Literature_Review
[13] Tao Li and Mitsunori Ogihara, "Content-Based Music Similarity Search and Emotion Detection," International Conference on Acoustics, Speech and Signal Processing (ICASSP 2004), pp. 705–708, 2004.
[14] https://www.researchgate.net/figure/Thayers-model
[15] Hiba Ahsan, Vijay Kumar, and C. V. Jawahar, "Multi-Label Annotation of Music," IEEE, 2015.
[16] L. Lu, D. Liu, and H. J. Zhang, "Automatic mood detection and tracking of music audio signals," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 5–18, 2006.
[17] D. Liu, L. Lu, and H.-J. Zhang, "Automatic Mood Detection from Acoustic Music Data," ISMIR 2003.
[18] G. Peeters, "A generic training and classification system for MIREX08 classification tasks: Audio music mood, audio genre, audio artist and audio tag," MIREX 2008.
[19] https://www.researchgate.net/publication/318760264_Analysis_of_Features_for_Mood_Detection_in_North_Indian_Classical_Music-A_Literature_Review
Copyright © 2023 Praharsh Churi, Yadnesh Patil, Sahil Patil, Shivendra Patil, Prof. Anand Magar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET56981
Publish Date : 2023-11-24
ISSN : 2321-9653
Publisher Name : IJRASET